Enhancing children's speech recognition under mismatched condition by explicit acoustic normalization

Authors

Shweta Ghai

Rohit Sinha

Abstract

Most commonly used model adaptation techniques apply a linear/affine transformation to the models or features to address the gross acoustic mismatch between adults’ and children’s speech data. Since a linear transformation alone may not appropriately model all sources of acoustic mismatch, this work explores the efficacy of our recently proposed explicit acoustic (pitch and speaking rate) normalization in combination with existing normalization/adaptation techniques for mismatched children’s speech recognition. The study shows that explicitly normalizing the pitch and speaking rate of children’s speech further improves the effectiveness of the adaptation methods. With explicit acoustic normalization, significant relative improvements of 13% and 5% are obtained over combined VTLN and CMLLR for children’s speech recognition on models trained on adults’ speech, for connected digit and continuous speech recognition tasks, respectively.
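As a rough illustration of the linear/affine feature transformation that the abstract refers to, the sketch below applies x' = A x + b to each feature frame, which is the general form used by feature-space adaptation such as CMLLR. It is not the authors' implementation: the function name `affine_transform` and the toy matrix, bias, and frame values are hypothetical, and estimating A and b from adaptation data is outside the scope of the sketch.

```python
from typing import List, Sequence


def affine_transform(
    frames: Sequence[Sequence[float]],
    A: Sequence[Sequence[float]],
    b: Sequence[float],
) -> List[List[float]]:
    """Apply x' = A x + b to every feature frame (hypothetical helper)."""
    if len(A) != len(b):
        raise ValueError("A must have one row per bias entry")
    transformed = []
    for x in frames:
        if any(len(row) != len(x) for row in A):
            raise ValueError("each row of A must match the frame dimension")
        # One output coefficient per row of A: dot(row, x) plus the bias term.
        transformed.append(
            [sum(a * v for a, v in zip(row, x)) + bias for row, bias in zip(A, b)]
        )
    return transformed


# Toy values, chosen only to make the arithmetic easy to follow.
A = [[2.0, 0.0], [0.0, 0.5]]
b = [1.0, -1.0]
frames = [[1.0, 2.0]]
normalized = affine_transform(frames, A, b)
```

A single global A and b can only scale, rotate, and shift the feature space, which is why the abstract argues that some mismatch sources (pitch, speaking rate) may need explicit normalization in addition.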

Similar resources

Exploring the Effect of Differences in the Acoustic Correlates of Adults' and Children's Speech in the Context of Automatic Speech Recognition

This work explores the effect of mismatches between adults’ and children’s speech due to differences in various acoustic correlates on the automatic speech recognition performance under mismatched conditions. The different correlates studied in this work include the pitch, the speaking rate, the glottal parameters (open quotient, return quotient, and speech quotient), and the formant frequencie...

Pitch-Adaptive Front-End Features for Robust Children's ASR

In the presented work, we explore some of the challenges in recognizing children’s speech on automatic speech recognition (ASR) systems developed using adults’ speech. In such mismatched ASR tasks, a severely degraded recognition performance is observed due to the gross mismatch in the acoustic attributes between those two groups of speakers. Among the various sources of mismatch, we focus on t...

A Study on the Effect of Pitch on LPCC and PLPC Features for Children's ASR in Comparison to MFCC

In this work, following our previous studies, we study and quantify the effect of pitch on LPCC and PLPC features and explore their efficacy for children’s mismatched ASR in comparison to MFCC. Our analysis shows that, unlike MFCC, the LPCC feature is not strongly influenced by pitch variations. On the other hand, similar to MFCC, though PLPC is also found to be significantly affected by pitch variatio...

Investigating recognition of children's speech

In this work recognition of children’s speech was investigated by considering a phone recognition task. Two baseline systems were trained, one for children and one for adults, by exploiting two Italian speech databases. Under matching conditions, training and recognition performed with data from the same population group, the phone recognition accuracy was 77.30% and 79.43% for children and adu...

On the development of matched and mismatched Italian children's speech recognition systems

While at least read speech corpora are available for Italian children’s speech research, there exist many languages which completely lack children’s speech corpora. We propose that learning statistical mappings between the adult and child acoustic space using existing adult/children corpora may provide a future direction for generating children’s models for such data deficient languages. In thi...

Publication date: 2010
